Efficient Inter-Task Communication for Nested Loop Programs on a Multiprocessor System
نویسندگان
چکیده
In modern multiprocessor systems, processors can be stalled by inter-task communication when reading from a remote buffer. This paper presents a solution for the inter-task communication, that has a minimal impact on the performance of the system, hides the inter-task communication latency without requiring additional hardware. The solution applies to jobs, represented as task graphs, where the tasks are nested loop programs. Buffers are allocated in scratch-pad memories of the consuming tasks to provide low latency read access. For the nested loop programs, minimal buffer sizes can be determined to cover all possible communication patterns. The added computational complexity is low, as the solution adds only a few operations to the nested loop programs. Keywords—Nested Loop Program, Scratch-Pad Memory, Circular Buffer.
منابع مشابه
Free Scheduling of General Nested Loops For Distributed Memory Architectures
The most extensive, in terms of time execution, part of a program is the nested loops. Loop parallelization involves two steps: First the time partitioning of the index space to achieve the minimum makespan, and second the efficient assignment of the concurrent partitions into the target parallel architecture. If distributed memory multiprocessor systems are used, overall performance is decline...
متن کاملParallelization of While-Loops in Nested Loop Programs for Real-time Multiprocessor Systems
Many applications with stream processing behavior contain one or more loops with an unknown number of iterations. These loops have to be parallelized in order to utilize the maximum capacity of an embedded multiprocessor platform and thus increase the total throughput. This thesis presents a method to automatically extract a parallel task graph based on function level parallelism from a sequent...
متن کاملPre-scheduling and Scheduling of Task Graph on Homogeneous Multiprocessor Systems
Task graph scheduling is a multi-objective optimization and NP-hard problem. In this paper a new algorithm on homogeneous multiprocessors systems is proposed. Basically, scheduling algorithms are targeted to balance the two parameters of time and energy consumption. These two parameters are up to a certain limit in contrast with each other and improvement of one causes reduction in the othe...
متن کاملPre-scheduling and Scheduling of Task Graph on Homogeneous Multiprocessor Systems
Task graph scheduling is a multi-objective optimization and NP-hard problem. In this paper a new algorithm on homogeneous multiprocessors systems is proposed. Basically, scheduling algorithms are targeted to balance the two parameters of time and energy consumption. These two parameters are up to a certain limit in contrast with each other and improvement of one causes reduction in the othe...
متن کاملCache Optimization for Coarse Grain Task Parallel Processing Using Inter-Array Padding
The wide use of multiprocessor system has been making automatic parallelizing compilers more important. To improve the performance of multiprocessor system more by compiler, multigrain parallelization is important. In multigrain parallelization, coarse grain task parallelism among loops and subroutines and near fine grain parallelism among statements are used in addition to the traditional loop...
متن کامل